Gaussian Copula Precision Estimation with Missing Values

Authors

  • Huahua Wang
  • Farideh Fazayeli
  • Soumyadeep Chatterjee
  • Arindam Banerjee
Abstract

We consider the problem of estimating the sparse precision matrix of Gaussian copula distributions using samples with missing values in high dimensions. Existing approaches, primarily designed for Gaussian distributions, suggest using plugin estimators by disregarding the missing values. In this paper, we propose double plugin Gaussian (DoPinG) copula estimators to estimate the sparse precision matrix corresponding to nonparanormal distributions. DoPinG uses two plugin procedures and consists of three steps: (1) estimate nonparametric correlations based on observed values, including Kendall's tau and Spearman's rho; (2) estimate the nonparanormal correlation matrix; (3) plug into existing sparse precision estimators. We prove that DoPinG copula estimators consistently estimate the nonparanormal correlation matrix at a rate of $O\!\left(\frac{1}{1-\delta}\sqrt{\frac{\log p}{n}}\right)$, where $\delta$ is the probability of missing values. We provide experimental results to illustrate the effect of sample size and percentage of missing data on the model performance. Experimental results show that DoPinG is significantly better than estimators like mGlasso, which are primarily designed for Gaussian data.
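Steps (1) and (2) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes missing entries are encoded as NaN, uses Kendall's tau only, and assumes step (2) refers to the standard nonparanormal relation $\Sigma_{jk} = \sin(\frac{\pi}{2}\tau_{jk})$. The function names `kendall_tau_observed` and `copula_correlation` are hypothetical.

```python
import numpy as np


def kendall_tau_observed(x, y):
    """Kendall's tau (tau-a) from the rows where both x and y are observed.

    Assumption: missing values are encoded as NaN.
    """
    both = ~np.isnan(x) & ~np.isnan(y)
    x, y = x[both], y[both]
    n = x.size
    if n < 2:
        return np.nan
    # sign(x_i - x_k) * sign(y_i - y_k) is +1 for a concordant pair,
    # -1 for a discordant pair, and 0 on ties or when i == k.
    sx = np.sign(x[:, None] - x[None, :])
    sy = np.sign(y[:, None] - y[None, :])
    # The sum runs over ordered pairs, so dividing by n * (n - 1) gives tau-a.
    return float((sx * sy).sum() / (n * (n - 1)))


def copula_correlation(X):
    """Pairwise rank-based estimate of the latent correlation matrix.

    Step 1: Kendall's tau on the jointly observed entries of each column pair.
    Step 2: sine transform, assumed here to be the nonparanormal plugin.
    """
    p = X.shape[1]
    S = np.eye(p)
    for j in range(p):
        for k in range(j + 1, p):
            tau = kendall_tau_observed(X[:, j], X[:, k])
            S[j, k] = S[k, j] = np.sin(0.5 * np.pi * tau)
    return S


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=300)
    X = np.exp(Z)                          # monotone transform: non-Gaussian marginals
    X[rng.random(X.shape) < 0.1] = np.nan  # entries missing completely at random
    print(copula_correlation(X))
```

For step (3), the resulting matrix would be passed to an existing sparse precision estimator such as the graphical lasso. Because each entry is estimated from a different subset of rows, the matrix is not guaranteed to be positive semidefinite, so a projection step may be needed before plugging it in.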

Similar resources

EM algorithm in Gaussian copula with missing data

Rank-based correlation is widely used to measure dependence between variables when their marginal distributions are skewed. Estimation of such correlation is challenged by both the presence of missing data and the need for adjusting for confounding factors. In this paper, we consider a unified framework of Gaussian copula regression that enables us to estimate either Pearson correlation or rank-...

Spatial Copula Model for Imputing Traffic Flow Data from Remote Microwave Sensors

Issues of missing data have become increasingly serious with the rapid increase in usage of traffic sensors. Analyses of the Beijing ring expressway have shown that up to 50% of microwave sensors have missing values. The imputation of missing traffic data must be solved urgently, although a precise solution cannot be easily achieved due to the significant number of missing portions. In thi...

A Bayesian Approach to Inference and Prediction for Spatially Correlated Count Data Based on Gaussian Copula Model

The Gaussian copula has been successfully applied to spatially correlated count data due to its ability to completely model the high-dimensional dependence. In this article, we develop a Bayesian method to fulfill both parameter estimation and spatial prediction for spatially correlated count data sets. An MCMC scheme (Metropolis-Hastings algorithm plus rejection sampling) is adopted to iteratively u...

Spatial Interpolation Using Copula for non-Gaussian Modeling of Rainfall Data

One of the most useful tools for handling multivariate distributions of dependent variables in terms of their marginal distributions is a copula function. The copula families have attracted considerable attention due to their applicability and flexibility in describing non-Gaussian spatially dependent data. The particular properties of the spatial copula are rarely ...

Multi-task Sparse Structure Learning with Gaussian Copula Models

Multi-task learning (MTL) aims to improve generalization performance by learning multiple related tasks simultaneously. While sometimes the underlying task relationship structure is known, often the structure needs to be estimated from data at hand. In this paper, we present a novel family of models for MTL, applicable to regression and classification problems, capable of learning the structure...

Publication date: 2014